Speaker Information using Subsegmental and Segmental Analysis of LP Residual
Abstract
The Linear Prediction (LP) residual mostly contains excitation source information. This work analyzes the LP residual twice: once with a frame size of 5 ms (subsegmental) and once with a frame size of 20 ms (segmental), each with a shift of 2.5 ms. The residual frames are then subjected to nonparametric Vector Quantization (VQ) to store the unique excitation sequences for each speaker. Testing shows that such codebooks contain significant speaker information. Further, the speaker information at the two levels is found to be different in nature. The performance of the speaker recognition system using subsegmental, segmental and combined subsegmental-segmental speaker information is found to be 71.67%, 55% and 83.33%, respectively, for a population of 30 speakers taken from the TIMIT database using a codebook of size 128. Furthermore, combining the proposed subsegmental-segmental LP residual based system with a vocal tract system feature based system, which has a standalone performance of 95%, yields a performance of 100%. This result reinforces that different speaker information is available in the excitation component of speech.
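The processing chain described in the abstract (LP residual, framing at two sizes with a common shift, one VQ codebook per speaker) can be sketched roughly as follows. This is only an illustration under several assumptions that the abstract does not state: a 16 kHz sampling rate, an LP order of 10, a single LP filter estimated over the whole signal with the autocorrelation method, and a plain k-means style codebook update. All function names here are hypothetical and are not taken from the paper.

```python
import numpy as np


def lp_residual(signal, order=10):
    """LP residual via the autocorrelation method (assumed: one filter for the whole signal)."""
    signal = np.asarray(signal, dtype=float)
    n = len(signal)
    # Autocorrelation at lags 0..order.
    r = np.array([np.dot(signal[: n - k], signal[k:]) for k in range(order + 1)])
    # Toeplitz normal equations R a = r[1:], lightly regularized for stability.
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    R = R + 1e-9 * r[0] * np.eye(order)
    a = np.linalg.solve(R, r[1:])
    # Inverse filter A(z) = 1 - sum_k a_k z^{-k}; its output is the residual.
    inverse_filter = np.concatenate(([1.0], -a))
    return np.convolve(signal, inverse_filter)[:n]


def frame_signal(x, frame_len, shift):
    """Cut x into overlapping frames of frame_len samples, advancing by shift samples."""
    n_frames = 1 + (len(x) - frame_len) // shift
    idx = np.arange(frame_len)[None, :] + shift * np.arange(n_frames)[:, None]
    return x[idx]


def sq_dists(frames, codebook):
    """Squared Euclidean distance from every frame to every codeword."""
    d = (
        (frames ** 2).sum(axis=1)[:, None]
        - 2.0 * frames @ codebook.T
        + (codebook ** 2).sum(axis=1)[None, :]
    )
    return np.maximum(d, 0.0)  # clip tiny negatives caused by rounding


def train_codebook(frames, size, n_iter=10, seed=0):
    """k-means style VQ training (an assumed algorithm); needs len(frames) >= size."""
    rng = np.random.default_rng(seed)
    codebook = frames[rng.choice(len(frames), size=size, replace=False)]
    for _ in range(n_iter):
        nearest = sq_dists(frames, codebook).argmin(axis=1)
        for k in range(size):
            members = frames[nearest == k]
            if len(members):  # leave empty cells unchanged
                codebook[k] = members.mean(axis=0)
    return codebook


def distortion(frames, codebook):
    """Average distance to the nearest codeword; lower means a better match."""
    return sq_dists(frames, codebook).min(axis=1).mean()


def ms_to_samples(ms, fs):
    return int(round(ms * fs / 1000))


if __name__ == "__main__":
    fs = 16000  # assumed sampling rate
    rng = np.random.default_rng(0)
    speech = rng.standard_normal(fs)  # 1 s of noise standing in for real speech
    residual = lp_residual(speech)
    shift = ms_to_samples(2.5, fs)
    for name, ms in (("subsegmental", 5), ("segmental", 20)):
        frames = frame_signal(residual, ms_to_samples(ms, fs), shift)
        codebook = train_codebook(frames, size=16)
        print(name, frames.shape, codebook.shape, distortion(frames, codebook))
```

In an identification setting, one would presumably train one codebook per speaker and per level, then assign a test utterance to the speaker whose codebook gives the lowest `distortion`. How the subsegmental and segmental scores are combined is not specified in the abstract, so it is left out of this sketch.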
Similar resources
Self Determining Speaker Recognition by Three Level Segmental Processing Of Linear Prediction Residual
This paper proposes a speaker recognition system that exploits speaker-specific source information (LP residual) present at different levels, namely subsegmental, segmental and suprasegmental. The subsegmental analysis considers the LP residual in blocks of 5 msec with a shift of 2.5 msec to extract speaker information. The segmental analysis extracts speaker informa...
Subsegmental, Segmental and Suprasegmental Features for Speaker Recognition Using Gaussian Mixture Model
In the feature extraction stage, features representing speaker information are extracted from the speech signal. In the present study, the LP residual derived from the speech data is used for training and testing, and is processed in the time domain at subsegmental, segmental and suprasegmental levels. In the training phase, GMMs are built, one for each speaker, using the training data ...
Features for speaker and language identification
In this paper we examine several features derived from the speech signal for the purpose of identification of speaker or language from the speech signal. Most of the current systems for speaker and language identification use spectral features from short segments of speech. There are additional features which can be derived from the residual of the speech signal, which correspond to th...
Subsegmental, Segmental and Suprasegmental Features for Speaker Recognition Using Ergodic Hidden Markov Model
Time-frequency analysis of vocal source signal for speaker recognition
This paper investigates the importance of spectrotemporal characteristics of the source excitation signal for speaker recognition. We propose an effective feature extraction technique for obtaining essential time-frequency information from the linear prediction (LP) residual signal, which is closely related to the glottal excitation of an individual speaker. With pitch synchronous analysis, wavele...
Publication date: 2009